首页> 外文OA文献 >Constructing liberal and conservative supertrees and exact solutions for reduced consensus problems
【2h】

Constructing liberal and conservative supertrees and exact solutions for reduced consensus problems

机译:构建自由和保守的超级树以及减少共识问题的精确解决方案

代理获取
本网站仅为用户提供外文OA文献查询和代理获取服务,本网站没有原文。下单后我们将采用程序或人工为您竭诚获取高质量的原文,但由于OA文献来源多样且变更频繁,仍可能出现获取不到、文献不完整或与标题不符等情况,如果获取不到我们将提供退款服务。请知悉。

摘要

This thesis studies two different approaches to extracting information from collections of phylogenetic trees: supertrees and reduced consensus. Supertree methods combine the phylogenetic information from multiple partially-overlapping trees into a larger phylogenetic tree called a supertree. Several supertree construction methods have been proposed to date, but most of these are not designed with any specific properties in mind. Recently, Cotton and Wilkinson proposed extensions of the majority-rule consensus tree method to the supertree setting that inherit many of the appealing properties of the former.We study a variant of one of Cotton and Wilkinson\u27s methods, called majority-rule (+) supertrees. After proving that a key underlying problem for constructing majority-rule (+) supertrees is NP-hard, we develop a polynomial-size exact integer linear programming formulation of the problem. We then present a data reduction heuristic that identifies smaller subproblems that can be solved independently. While this technique is not guaranteed to produce optimal solutions, it can achieve substantial problem-size reduction. Finally, we report on a computational study of our approach on various real data sets, including the 121-taxon, 7-tree Seabirds data set of Kennedy and Page. The results indicate that our exact method is computationally feasible for moderately large inputs. For larger inputs, our data reduction heuristic makes it feasible to tackle problems that are well beyond the range of the basic integer programming approach. Comparisons between the results obtained by our heuristic and exact solutions indicate that the heuristic produces good answers. Our results also suggest that the majority-rule (+) approach, in both its basic form and with data reduction, yields biologically meaningful phylogenies.Generalizations of the strict and loose consensus methods to the supertree setting, recently introduced by McMorris and Wilkinson, are studied. The supertrees these methods produce are conservative in the sense that they only preserve information (in the form of splits) that is supported by at least one the input trees and that is not contradicted by any of the input trees. Alternative, equivalent, formulations of these supertrees are developed. These are used to prove the NP-completeness of the underlying optimization problems and to give exact integer linear programming solutions. For larger data sets, a divide and conquer approach is adopted, based on the structural properties of these supertrees. Experiments show that it is feasible to solve problems with several hundred taxa and several hundred trees in a reasonable amount of time.A rogue taxon in a collection of phylogenetic trees is one whose position varies drastically from tree to tree. The presence of such taxa can greatly reduce the resolution of the consensus tree (e.g., the majority-rule or strict consensus) for a collection. The reduced consensus approach aims to identify rogue taxa and to produce more informative consensus trees. Given a collection of phylogenetic trees over the same leaf set, the goal is to find a set of taxa whose removal maximizes the number of internal edges in the consensus tree of the collection. This problem is proven to be NP-hard for strict and majority-rule consensus. We describe exact integer linear programming formulations for computing reduced strict, majority and loose consensus trees. In experimental tests, our exact solutions show significant improvement over heuristic methods on several problem instances.
机译:本文研究了两种不同的从系统发育树集合中提取信息的方法:超树和简化共识。超级树方法将来自多个部分重叠的树的系统发育信息组合成一个更大的系统树,称为超级树。迄今为止,已经提出了几种超级树构造方法,但是大多数方法在设计时都没有考虑到任何特定的属性。最近,Cotton和Wilkinson提出了将多数规则共识树方法扩展到继承前者许多特征的超级树环境的方法。我们研究了Cotton和Wilkinson方法的一种变体,称为多数规则(+ )超级树。在证明构造多数规则(+)超树的关键基本问题是NP-hard之后,我们开发了该问题的多项式大小精确整数线性规划公式。然后,我们提出一种数据归约启发式方法,该方法可识别可以独立解决的较小子问题。虽然不能保证此技术可以产生最佳解决方案,但可以大大减少问题的大小。最后,我们报告了对各种实际数据集(包括121个分类单元,7棵树的肯尼迪海鸟数据集和佩奇)的实际方法的计算研究。结果表明,对于较大的输入,我们的精确方法在计算上是可行的。对于较大的输入,我们的数据约简启发法使解决远远超出基本整数编程方法范围的问题变得可行。通过我们的启发式方法和精确解获得的结果之间的比较表明,启发式方法可以产生良好的答案。我们的研究结果还表明,多数规则(+)方法在其基本形式和数据减少的基础上都产生了生物学上有意义的系统发育树.McMorris和Wilkinson最近引入了将严格和宽松的共识方法推广到超级树环境的方法。研究。这些方法产生的超树在某种意义上是保守的,因为它们仅保留信息(以拆分的形式),这些信息至少由一个输入树支持,并且不受任何输入树的抵触。开发了这些超级树的替代等效项。这些用来证明基本优化问题的NP完备性,并给出精确的整数线性规划解。对于较大的数据集,基于这些超级树的结构属性,采用分而治之的方法。实验表明,在合理的时间内解决数百个分类单元和几百棵树木的问题是可行的。系统发育树集合中的流氓分类单元是一个位置,其位置因树木而异。这种分类单元的存在会极大地降低集合的共识树(例如多数规则或严格共识)的分辨率。减少共识的方法旨在识别恶意分类单元并产生更多有用的共识树。给定同一叶集上的系统发育树集合,目标是找到一组分类单元,去除这些分类单元可使集合的共有树中内部边缘的数量最大化。事实证明,对于严格和多数规则的共识,此问题很难解决。我们描述了用于计算简化的严格树,多数树和松散共识树的精确整数线性规划公式。在实验测试中,我们的精确解决方案在几个问题实例上显示出比启发式方法显着改进。

著录项

  • 作者

    Dong, Jianrong;

  • 作者单位
  • 年度 2012
  • 总页数
  • 原文格式 PDF
  • 正文语种 en
  • 中图分类

相似文献

  • 外文文献
  • 中文文献
  • 专利
代理获取

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号